Low-dimensional Models in Spatio-Temporal Wind Speed Forecasting
Abstract: Integrating wind power into the grid is challenging
because of its random nature. Integration is facilitated
with accurate short-term forecasts of wind power. The paper 
presents a spatio-temporal wind speed forecasting algorithm that 
incorporates the time series data of a target station and data 
of surrounding stations. Inspired by Compressive Sensing (CS) 
and structured-sparse recovery algorithms, we claim that there 
usually exists an intrinsic low-dimensional structure governing 
a large collection of stations that should be exploited. We cast 
the forecasting problem as recovery of a block-sparse signal 
x from a set of linear equations b = Ax for which we 
propose novel structured-sparse recovery algorithms. Results
of a case study on the East Coast show that the proposed
Compressive Spatio-Temporal Wind Speed Forecasting (CST-WSF)
algorithm significantly improves the short-term forecasts
compared to a set of widely-used benchmark models.

I. Introduction 

A. Variable Energy Resources 

Many countries, as well as many states in the U.S., have
mandated aggressive Renewable Portfolio Standards (RPSs).
Among the different renewable energy resources, wind energy
alone is expected to grow to provide between 15% and 25% of
the world's electricity by 2050. According to another study,
the world's total wind power capacity has doubled every three
years since 2000, reaching an installed capacity of 197 GW in
2010 and 369 GW in 2014 [1], [2]. The random nature of wind,
however, makes it difficult to achieve the power balance
needed for its integration into the grid [3], [4]. The use of
ancillary services such as frequency regulation and load
following to compensate for such imbalances [5]-[8] is
facilitated by accurate forecasts [9], [10].

B. Wind Energy Forecasting Methods 

One can directly attempt to forecast wind power. An 
alternative approach is to forecast the wind speed and then 
convert it to wind power using given power curves. This 
approach will accommodate different wind turbines installed 
in a wind farm experiencing the same wind speed but 
resulting in different wind power generation. We focus on 
wind speed forecasting in this paper. Wind speed forecasting 
methods can be categorized into different groups: (i) model-based
methods such as Numerical Weather Prediction (NWP)
vs. data-driven methods, (ii) point forecasting vs. probabilistic
forecasting, and (iii) short-term forecasting vs. long-term
forecasting. This paper is concerned with short-term
point forecasting using both temporal data as well as spatial 
information. For a more complete survey of wind speed 
forecasting methods see [11] and [12] among others. 

C. Spatio-Temporal Wind Speed Forecasting 

There is a growing interest in the so-called spatio-temporal 
forecasting methods that use information from neighboring 
stations to improve the forecasts of a target station, since 
there is a significant cross-correlation between the time series 
data of a target station and its surrounding stations. We 
review some of the spatio-temporal forecasting methods. 
Gneiting et al. [13] introduced the Regime Switching Space- 
Time Diurnal (RSTD) model for average wind speed data 
based on both spatial and temporal information. This method 
was later improved by Hering and Genton [14], who incorporated
wind direction in the forecasting process by introducing
the Trigonometric Direction Diurnal (TDD) model. Xie et
al. [15] also considered probabilistic TDD forecasts for power
system economic dispatch. Dowell et al. [16] employed
a multi-channel adaptive filter to predict the wind speed
and direction by taking advantage of spatial correlations at
numerous geographical sites. He et al. [17] presented Markov
chain-based stochastic models for predictions of wind power 
generation after characterizing the statistical distribution of 
aggregate power with a graph learning-based spatio-temporal 
analysis. Regime-switching models based on wind direction 
are studied by Tastu et al. [18] where they consider various 
statistical models, such as ARX models, to understand the 
effects of different variables on forecast error characteristics. 
A methodology for probabilistic wind power forecasts in
the form of predictive densities that takes spatial information
into account was developed in [19]. Sparse Gaussian
Conditional Random Fields (CRFs) have also been deployed 
for probabilistic wind power forecasting [20]. See [21] for a 
comprehensive review of the state-of-the-art methods. 

D. Our Contribution 

Inspired by Compressive Sensing (CS) and structured- 
sparse recovery algorithms, we claim that there usually 
exists an intrinsic low-dimensional structure governing the 
interactions among a large collection of weather stations. 
Such low-dimensional models should be exploited in the 
forecasting process. To this end, we cast the forecasting 
problem as the recovery of a block-sparse signal x from a 
set of linear equations b = Ax for which we propose novel
structured-sparse recovery algorithms. As a case study, we
apply our proposed forecasting algorithm to the data recorded
from 57 measuring locations (a combination of airports and
weather stations) in a region on the East Coast including
Massachusetts, Connecticut, New York, and New Hampshire.
The results show a considerable improvement of the short-term
forecasts compared to a set of widely-used benchmark
models and advanced spatio-temporal approaches.


E. Paper Organization 

In Section II, we formulate the forecasting problem. The
proposed forecasting algorithms and related concepts are
presented in Section III. We apply the proposed methods
to real wind speed data and compare the results with other
benchmark methods in Section IV. Section V presents our
concluding remarks and possible future directions.


II. Multivariate Autoregressive (M-AR) Model 

Autoregressive (AR) models assume that the output variable
of a system can be well represented as a weighted linear
combination of its own previous values. Multivariate
Autoregressive (M-AR) (a.k.a. Vector Autoregressive) models
generalize this approach to multivariate time series. Let
$y(t) \in \mathbb{R}^P$ be a P-dimensional output measurement (e.g.,
wind speeds at P weather stations) at time t. An M-AR
model of order n is written as

$$y(t) = X_1 y(t-1) + \cdots + X_n y(t-n) + e(t) = \sum_{j=1}^{n} X_j y(t-j) + e(t), \qquad (2)$$

where $X_j \in \mathbb{R}^{P \times P}$ is the coefficient matrix associated with the
j-th time lag and $e(t)$ is Gaussian noise. Using a different
notation, let $y_t^i$ be the wind speed of the i-th station at sample
time t for t = 1, 2, ..., M + n. For each station, the M-AR
model (2) can be rewritten in a matrix-vector product format
as in (1), where N := nP. In the training stage, the goal is
to find a coefficient vector $x \in \mathbb{R}^N$ that best explains the
observations $b \in \mathbb{R}^M$ given $A \in \mathbb{R}^{M \times N}$. As seen from (1),
x has a block structure, as the coefficients corresponding to
each station appear in one vector-block.
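
To make the block structure concrete, the following is a minimal sketch of how the regression pair (A, b) for one target station might be assembled from a table of measurements. The function name `build_mar_system`, the array layout of `Y`, and the column ordering (station by station, lag 1 first within each block) are assumptions made for illustration; the exact layout of the paper's matrix form is not reproduced in this text.

```python
import numpy as np

def build_mar_system(Y, target, n):
    """Assemble b = A x for one target station (illustrative layout).

    Y      : array of shape (T, P); Y[t, i] is the wind speed of station i
             at sample t, with T = M + n samples in total.
    target : column index of the target station.
    n      : order of the M-AR model (number of lags per station).

    Returns A of shape (M, n * P) and b of shape (M,), where M = T - n.
    Columns i*n ... i*n + n - 1 of A form the block of station i.
    """
    T, P = Y.shape
    M = T - n
    A = np.empty((M, n * P))
    for i in range(P):
        for j in range(1, n + 1):
            # Lag-j values of station i for every predicted sample.
            A[:, i * n + (j - 1)] = Y[n - j : T - j, i]
    b = Y[n:, target]
    return A, b
```

With this layout, an ordinary least-squares fit would be `np.linalg.lstsq(A, b, rcond=None)`; the block-sparse approach of the next section instead restricts which station blocks may be non-zero.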


III. Compressive Spatio-Temporal Wind Speed Forecasting (CST-WSF)

We believe that among a large collection of stations, only 
a few of them have a strong correlation with the target 
station. We show that under the assumption of sparsity of 
the interconnections (that is, assuming only a few weather 
stations contribute to the output of the target station), there 
will be a distinct structure to the solution x that we are 
seeking. In particular, a typical coefficient vector x under 
our model assumptions will have very few non-zero entries, 
and these non-zero entries will be clustered in a few locations.
Vectors with such structure are called block-sparse. The
number of non-zero blocks corresponds to the number of links
that contribute to the output of the target station. For a given
target station, we then solve the minimization problem: 

$$\min_{x} \|b - Ax\|_2 \quad \text{subject to} \quad x \text{ is block-sparse}. \qquad (3)$$

We call this approach Compressive Spatio-Temporal Wind 
Speed Forecasting (CST-WSF), as it is inspired by 
Compressive Sensing (CS) and structured-sparse recovery. 

A. Background on CS 

CS enables recovery of an unknown signal from an
underdetermined set of measurements under the assumption of
sparsity of the signal and under certain conditions on the
measurement matrix A [22]. The CS recovery problem can
be viewed as recovery of a K-sparse signal $x \in \mathbb{R}^N$ from
its observations $b = Ax \in \mathbb{R}^M$, where $A \in \mathbb{R}^{M \times N}$ is the
measurement matrix with M < N (in many cases $M \ll N$).
A K-sparse signal $x \in \mathbb{R}^N$ is a signal of length N with K
non-zero entries, where K < N. Since the null space of A
is non-trivial, there are infinitely many candidate solutions
to the equation b = Ax; however, CS recovery algorithms
exploit the fact that, under certain conditions on A, only one
candidate solution is suitably sparse. The interested reader can
refer to [23], [24] for several proposed recovery conditions.

B. Uniform CST-WSF 

We adapt our CST-WSF algorithm from tools proposed in 
the CS literature for recovery of a block-sparse signal x. 














Definition 1 (Block K-Sparse Signal): Let $x \in \mathbb{R}^N$ be a
concatenation of P vector-blocks $x_i \in \mathbb{R}^n$, i.e.,

$$x = [x_1^T \ \cdots \ x_i^T \ \cdots \ x_P^T]^T,$$

where N = nP. A signal $x \in \mathbb{R}^N$ is called block K-sparse
if it has K < P non-zero blocks. □

Several extensions of the standard CS recovery algorithms
can account for additional structure in the sparse signal to
be recovered [25], [26]. Among these, the Block Orthogonal
Matching Pursuit (BOMP) algorithm [26] is designed to
exploit block sparsity and is attractive for its flexibility in
recovering block-sparse signals of different sparsity levels and
its low computational complexity [27]. In a more general setting,
BOMP has recently been used for topology identification
of interconnected dynamical systems (e.g., see [28]).
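
As an illustration of the greedy block selection behind BOMP, the sketch below picks, at each iteration, the block whose columns correlate most strongly with the current residual and then refits by least squares on all selected blocks. It is a minimal textbook-style sketch, not the authors' implementation: the function name `bomp`, the `block_lengths` argument, the fixed iteration budget `K`, and the tolerance `tol` are all assumptions introduced here.

```python
import numpy as np

def bomp(A, b, block_lengths, K, tol=1e-10):
    """Greedy block-sparse recovery sketch (BOMP-style).

    A             : (M, N) matrix whose columns are grouped into blocks.
    b             : (M,) observation vector.
    block_lengths : length of each block, summing to N.
    K             : maximum number of blocks to select.
    Returns the estimate x (N,) and the sorted list of selected blocks.
    """
    starts = np.concatenate(([0], np.cumsum(block_lengths)))
    blocks = [np.arange(starts[i], starts[i + 1])
              for i in range(len(block_lengths))]
    x = np.zeros(A.shape[1])
    support = []
    residual = np.asarray(b, dtype=float).copy()
    for _ in range(K):
        if np.linalg.norm(residual) <= tol:
            break  # observations already explained
        # Score each block by the energy of its correlation with the residual.
        scores = [np.linalg.norm(A[:, idx].T @ residual) for idx in blocks]
        for i in support:
            scores[i] = -1.0  # never reselect a block
        support.append(int(np.argmax(scores)))
        # Refit on all selected blocks and update the residual.
        cols = np.concatenate([blocks[i] for i in support])
        coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        x[:] = 0.0
        x[cols] = coef
        residual = b - A[:, cols] @ coef
    return x, sorted(support)
```

Because the blocks are described by a list of lengths, the same sketch accepts blocks of equal or unequal size.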

C. Nonuniform CST-WSF 

In a uniform CST-WSF, the assumption is that a uniform
M-AR model as given in (1) governs the interactions
between stations. In other words, we assume that the target
station and its surrounding stations are related by AR models
of the same order. In this section, we consider a more general
version of the CST-WSF algorithm where the target
station and its surrounding stations are related by AR models
of different orders. This model structure, called Nonuniform
Multivariate Autoregressive (NM-AR), distinguishes between
the stations with high and low cross-correlation with the
target station. Let $n_i$ be the order associated with the i-th
station for i = 1, 2, ..., P. An NM-AR version of (1) can
be considered as given in (4), where $n_{\max} > \max_i n_i$ and
$N := \sum_{i=1}^{P} n_i$. This results in a nonuniform block-sparse
coefficient vector x whose blocks have different lengths.

Definition 2 (Nonuniform Block K-Sparse Signal): Let
$x \in \mathbb{R}^N$ be a concatenation of P vector-blocks $x_i \in \mathbb{R}^{n_i}$,
where $N = \sum_{i=1}^{P} n_i$. A signal $x \in \mathbb{R}^N$ is called nonuniform
block K-sparse if it has K < P non-zero blocks. □

Remark 1: One should note that our definition of nonuniform
block K-sparse signals is a generalization of the
conventional definition of block K-sparse signals (Definition
1), where all blocks have the same length, i.e., $n_i = n, \forall i$. □

Given $\{n_i\}_{i=1}^{P}$, the BOMP algorithm can be used for
recovery of x with $A_i \in \mathbb{R}^{M \times n_i}$. In order to find the set
of orders $\{n_i\}_{i=1}^{P}$, we use a correlation analysis. We then
adjust the orders to achieve the best prediction performance.

IV. Case Study of 57 Stations in East Coast 

We apply our proposed CST-WSF algorithms to real wind
speed data. East Coast states are good candidates for our
study because (i) wind speed profiles are higher and (ii)
there are more stations in close vicinity in these states.

A. Data Description 

We use hourly wind speed data from Meteorological
Terminal Aviation Routine (METAR) weather reports of 57
stations on the East Coast, including Massachusetts, Connecticut,
New York, and New Hampshire [29]. Fig. 1 depicts the area
under study and the location of these 57 stations. The target
station, Nantucket Memorial Airport (ACK) (circled in red), is
located on an island and is subject to wind profiles with high
ramps and speeds because the surrounding surface
has very low roughness heights. Furthermore, this area
correlates well with other stations because the
prevailing wind direction in this region is mainly northwest
or southeast. The time period from January 6, 2014 to February
20, 2014 is considered in our simulations. This time period
has the most unsteady wind conditions throughout the year.
The data is divided into two subsets: (i) a training subset from
January 6, 2014 to February 6, 2014 (a period of 30 days)
and (ii) a validation subset from February 6, 2014 to February
20, 2014 (a period of 14 days).

Fig. 1. Map of the area under study. The 57 measuring locations on the East
Coast are shown with yellow points. Circled in red is the target station ACK.

B. Comparison with Other Benchmark Algorithms 

In order to better gauge the effectiveness of the proposed
algorithm, we compare CST-WSF with other
benchmark algorithms in temporal and spatio-temporal wind
speed forecasting. For temporal wind speed forecasting, we
first consider the persistence forecasting method, which simply
uses the last measured value for the forecast interval. Any
algorithm that can improve upon persistence forecasting 
is judged to be an effective algorithm. We also consider 
AR models as well as an advanced prediction model that 
combines Wavelet Transform (WT) with Artificial Neural 
Network (ANN). The latter method is shown to have the 
capability of capturing nonlinearity in the wind speed time 
series. In this model, briefly, the volatile wind speed series is
first decomposed by the WT into a number of better-behaved
sub-series in various frequency bands. Subsequently, each
extracted sub-series is estimated separately using the ANN
model. Speed predictions are then reconstructed from all
sub-series except the highest frequency band, which represents the
most fluctuating part of the wind speed series (see [30] for
more details). Multi-step predictions were performed in a
recursive manner for a period of 14 consecutive days with 6
hour-ahead updates. The prediction results are given in Fig. 2.
As can be seen, the considered temporal prediction methods
provide reasonable forecasts compared to the persistence
model. The ANN-based model outperforms the AR model.

We also consider two spatio-temporal forecasting methods.
We first employed an advanced ANN-based spatio-temporal
model [31]. We also employed a Least Squares (LS) M-AR
spatio-temporal forecasting approach [18]. Fig. 3 depicts how
incorporation of spatial information improves the forecasting
performance as compared to temporal methods.


C. Uniform and Nonuniform CST-WSF 

We now apply our proposed uniform and nonuniform
CST-WSF methods. Note that in our simulations a new
coefficient vector x is obtained every 6 hours (equivalently,
every 6 time steps, as each time step is 1 hour). This is
the considered prediction time horizon. Also, we follow a
recursive approach in prediction. That is, the wind speed
predictions at time n + M + 1 for all stations ($y^i_{n+M+1}, \forall i$)
are included in the A matrix for predicting the wind speed
at time n + M + 2 ($y^i_{n+M+2}, \forall i$), and so on. This recursive
process goes on for 6 time steps. The elements of A are then
completely updated with real measurements and the recursive
process continues for another 6 time steps.
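
The recursive scheme described above can be sketched as follows. This is an illustrative sketch under stated assumptions: it presumes that a coefficient vector has already been fitted for every station (stacked as rows of a hypothetical array `coefs`), that all stations share the same order n, and that regressors are ordered station by station with lag 1 first. The function name `recursive_forecast` is introduced here and does not appear in the paper.

```python
import numpy as np

def recursive_forecast(history, coefs, n, steps=6):
    """Multi-step forecast that feeds predictions back as regressors.

    history : array of shape (>= n, P), most recent measurements, oldest first.
    coefs   : array of shape (P, n * P); row k is the coefficient vector
              used to predict station k (assumed to be fitted beforehand).
    n       : model order (number of lags per station).
    steps   : number of steps before real measurements replace predictions.
    Returns an array of shape (steps, P) with the predicted wind speeds.
    """
    window = np.array(history[-n:], dtype=float)  # last n samples, shape (n, P)
    preds = []
    for _ in range(steps):
        # Regressor row: for each station, lags 1..n (most recent first).
        row = window[::-1].T.reshape(-1)
        y_next = coefs @ row
        preds.append(y_next)
        # Slide the window: drop the oldest sample, append the prediction.
        window = np.vstack([window[1:], y_next])
    return np.array(preds)
```

After `steps` predictions, the caller would rebuild `history` from real measurements and call the function again, mirroring the 6-step update cycle.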

Fig. 4(a) shows the result of the uniform CST-WSF.
The result is superior to all benchmark approaches discussed
in the previous section. We then apply the nonuniform
CST-WSF algorithm. The result is illustrated in Fig. 4(b).


In order to demonstrate the effectiveness of the proposed
forecasting models, the associated Mean Absolute Error
(MAE) and Root Mean Squared Error (RMSE) values, which
are the most common performance metrics in the wind
forecasting literature, are listed in Table I for all of the
wind speed forecasting methods considered in this paper. The
MAE provides the average deviation between the measured
and predicted data, while the RMSE gives higher weights
to larger error values by squaring the differences. Moreover,
the Normalized Root Mean Squared Error (NRMSE),
which is the RMSE normalized by the range of observed
data, is calculated to provide a scale-independent error
measure. Evidently, the proposed uniform and nonuniform
CST-WSF methods outperform the other considered temporal
and spatio-temporal methods. The best prediction is provided
by the nonuniform CST-WSF.

TABLE I
Statistical Error Measure Comparison of Different Methods

Forecasting approach       MAE (m/s)   RMSE (m/s)   NRMSE (%)
Persistence Forecasting    2.1441      2.8334       16.86
AR of order 1              2.0742      2.7629       16.44
AR of order 3              2.0696      2.7560       16.40
WT-ANN                     1.8200      2.4671       14.68
ANN-based ST               1.7981      2.2997       13.69
LS-based ST                1.7234      2.1983       13.08
CST-WSF of order 3         1.5665      2.0595       12.25
Nonuniform CST-WSF         1.3442      1.7586       10.46

Considering the NRMSE as an example, the nonuniform CST-WSF
approach provides a reduction of 38%, 36.2%, and 28.7% as
compared to the considered temporal methods (persistence
forecasting, AR of order 3, and WT-ANN models) and a reduction
of 24% and 20% as compared to the considered spatio-temporal
methods (ANN-based ST and LS-based ST), respectively.
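
The three error measures and the relative reductions quoted from Table I can be written out as a short sketch. The function names are chosen here for illustration; the NRMSE follows the definition in the text (RMSE divided by the range of the observed data, expressed in percent), and the NRMSE values in the dictionary are copied from Table I.

```python
import math

def mae(observed, predicted):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def nrmse(observed, predicted):
    """RMSE normalized by the range of the observed data, in percent."""
    return 100.0 * rmse(observed, predicted) / (max(observed) - min(observed))

def reduction_percent(baseline, proposed):
    """Relative reduction of an error measure with respect to a baseline."""
    return 100.0 * (baseline - proposed) / baseline

# NRMSE values (%) taken from Table I.
nrmse_table = {
    "Persistence Forecasting": 16.86,
    "AR of order 3": 16.40,
    "WT-ANN": 14.68,
    "ANN-based ST": 13.69,
    "LS-based ST": 13.08,
}
nonuniform_nrmse = 10.46

for name, value in nrmse_table.items():
    print(f"{name}: {reduction_percent(value, nonuniform_nrmse):.1f}% reduction")
```

The printed reductions agree with the percentages in the text after rounding.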

Fig. 5 illustrates the corresponding block-sparse coefficient
vectors for the uniform and nonuniform CST-WSF methods
at the training stage. As can be seen, only a few of the blocks
in uniform and nonuniform CST-WSF are non-zero, resulting
in a block-sparse x. This block-sparse structure appears in
all of the other calculated coefficient vectors (with different
block-sparsity patterns) as we move over the prediction horizon,
and further confirms our motivation for exploiting
the intrinsic low-dimensional models in spatio-temporal wind
speed forecasting. It is worth noting that the proposed
CST-WSF methods require much less computational time
than the ANN-based methods and than the average
computational time of other short-term forecasting
methods proposed in the literature [15], [32]. For instance, the total
simulation time of the nonuniform CST-WSF approach is almost
half of the time required for the predictions with the WT-ANN
model and approximately 15% smaller than that of the LS
M-AR spatio-temporal model in this study.

Fig. 2. Comparison of different temporal forecasting algorithms.
(b) AR model of order 3. (c) WT-ANN model. Horizontal axis: time (h).

Fig. 3. Comparison of different spatio-temporal forecasting algorithms.
(a) Spatio-temporal ANN model. (b) Spatio-temporal LS M-AR model.
Horizontal axis: time (h).

V. Conclusion

We proposed two spatio-temporal wind speed forecasting
methods, called uniform and nonuniform CST-WSF. The
methods are inspired by CS and structured-sparse recovery
algorithms, where we claim that there usually exists an intrinsic
low-dimensional structure governing a large collection of
stations. The results of a case study show that the proposed
approaches considerably improve the short-term forecasts
compared to a set of widely-used benchmark models.

As future directions, we plan to apply the proposed
CST-WSF to a much larger set of stations. Incorporating
other variables (such as temperature, pressure, etc.) in the
wind speed forecasting is another research path. Yet another
direction is to investigate using sparsity-based ideas in
probabilistic forecasting methods. Such information about the
forecast errors is useful for Transmission System Operators
(TSOs), Independent Power Producers (IPPs), etc., in
evaluating the economic and technical risks due to uncertainty.



Fig. 4. Comparison of the uniform and nonuniform CST-WSF.

Fig. 5. Block-sparse coefficient vector. The red dashed lines specify 57
vector-blocks of the coefficient vector. (a) Uniform CST-WSF of order 3.
(b) Nonuniform CST-WSF with different orders (vector-block lengths).